Abstract


This paper describes the initial stages of a project which seeks to
develop a language learning game for Italian running a deep
linguistic grammar at its backend for fine-grained error detection. The
grammar is designed within the grammatical framework of Lexical
Functional Grammar (LFG). The project aims to bring together work
from different fields by combining strategies from computational
linguistics with theoretical insights from Second Language Acquisition
(SLA) and components from computer gaming.


1. Introduction


In recent years, many tools have been developed to facilitate language learning. 
Aside from commercial products such as Duolingo (Teske, 2017) or HelloTalk 
(Rivera, 2017), grammar checkers based on linguistic frameworks have been 
implemented as well. These include a grammar checker for German (Fortmann 
& Forst, 2004) based on a large-scale LFG grammar, and Arboretum (Bender 
et al., 2004), a tutorial system for English using Flickinger’s (2000) English 
Resource Grammar at its backend. 


While most commercially available systems rely on pattern-matching
algorithms, grammar-based tools allow more fine-grained error detection.
Instead of comparing an input string to pre-programmed answers, a deep 
linguistic grammar analyses an input string morphologically and syntactically. 
The system developed here combines an LFG-based grammar for Italian with
a large-scale lexicon covering a wide range of Italian vocabulary. The grammar
detects errors and provides the basis for generating feedback to the learner.
The learning process itself is guided by insights from
processability theory (Bettoni & Di Biase, 2015; Pienemann, 2005), a theory
of SLA that focusses on language development over time by analysing which
forms of a second language are processable at which developmental stage
(Pienemann, 2005). A recent account of the development of Italian
as a second language (Bettoni & Di Biase, 2015) informs both the design of
the exercises and the order in which they are presented to learners, in order
to support a successful language learning experience.


This paper presents the current state of the project by introducing the
components that form the backbone of the computer-assisted language learning
tool: the LFG-based grammar, its architecture, and the components that
support error detection and feedback generation.


2. Method 


Crucial to the tool is the combination of the following components: (1) 
concepts from Optimality Theory (OT) (Frank, King, Kuhn, & Maxwell, 1998) 
combined with error rules, (2) the generation component of the Xerox Linguistic 
Environment (XLE) (Crouch et al., 2008), and (3) the lexicon. Figure 1 illustrates
the system’s architecture and how these components feed into the language 
learning software. 


The user interface is fed by the LFG-based grammar, alongside learning
material. The grammar is responsible for evaluating free user input and, in
case of erroneous input, for generating feedback and a corrected sentence or
structure. The user is thus presented with an exercise, inputs their answer,
and receives feedback. This information is then used to update the user data,
and subsequent tasks are adapted accordingly.
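
The interaction cycle described above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's implementation: the names Feedback, UserData, evaluate, and next_task, as well as the error label, are hypothetical, and the grammar is replaced by a stub that recognises a single erroneous sentence.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Feedback:
    correct: bool
    error_type: Optional[str] = None   # hypothetical label for the error
    correction: Optional[str] = None   # grammatical alternative, if any


@dataclass
class UserData:
    # Hypothetical learner record: how often each error type has occurred.
    error_counts: Dict[str, int] = field(default_factory=dict)

    def update(self, feedback: Feedback) -> None:
        if feedback.error_type is not None:
            self.error_counts[feedback.error_type] = (
                self.error_counts.get(feedback.error_type, 0) + 1
            )


def evaluate(answer: str) -> Feedback:
    """Stub standing in for the grammar: it knows one erroneous sentence."""
    if answer == "Peter mangio una mela.":
        return Feedback(False, "subj-verb-agreement", "Peter mangia una mela.")
    return Feedback(True)


def next_task(user: UserData) -> str:
    """Choose the next exercise topic from the most frequent error so far."""
    if not user.error_counts:
        return "new-material"
    return max(user.error_counts, key=user.error_counts.get)


user = UserData()
feedback = evaluate("Peter mangio una mela.")
user.update(feedback)
print(feedback.correction, "->", next_task(user))
```

Selecting the next task by the most frequent error is only one possible adaptation policy; the ordering of exercises in the actual tool is meant to follow processability theory.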


Figure 1. Architecture 


3. System components
3.1. Lexical Functional Grammar 


LFG is a lexicalist, non-derivational theory of grammar (Dalrymple, 2001; 
Kaplan & Bresnan, 1982). In contrast to other generative grammars, LFG 
assumes parallel representations for sentences, each with its own structure
and vocabulary, and adhering to its own constraints. Constituent structure and 
functional structure (c- and f-structure) are the two main representations for 
sentences. While a c-structure depicts hierarchical relations, constituency, and 
linearity in the shape of a syntactic tree, an f-structure captures grammatical 
functions and semantic notions such as tense and aspect in an
attribute-value matrix. Owing to its mathematically well-defined architecture,
LFG is not only implementable but also efficient in analysing input. The Italian
grammar in this project is implemented with XLE, a platform commonly used
to implement LFG-based grammars.
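
To make the two representations more concrete, the following Python sketch encodes a c-structure as a nested tuple and an f-structure as a nested dictionary for the sentence Peter mangia una mela. The category labels, attribute names, and helper functions are illustrative assumptions and do not reproduce the analyses of the project grammar or the data formats used by XLE.

```python
# c-structure: a tree as nested tuples (category, child, child, ...);
# leaves are plain strings. The category labels are illustrative.
c_structure = (
    "S",
    ("NP", ("N", "Peter")),
    ("VP",
        ("V", "mangia"),
        ("NP", ("D", "una"), ("N", "mela"))),
)

# f-structure: an attribute-value matrix as a nested dict.
f_structure = {
    "PRED": "mangiare<SUBJ, OBJ>",
    "TENSE": "pres",
    "SUBJ": {"PRED": "Peter", "PERS": 3, "NUM": "sg"},
    "OBJ": {"PRED": "mela", "NUM": "sg", "SPEC": "indef"},
}


def leaves(node):
    """Return the words of a c-structure in linear order."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:          # node[0] is the category label
        words.extend(leaves(child))
    return words


def get_path(fs, *attributes):
    """Follow a path of attributes through an f-structure."""
    for attribute in attributes:
        fs = fs[attribute]
    return fs


print(" ".join(leaves(c_structure)))
print(get_path(f_structure, "SUBJ", "NUM"))
```

The tree keeps word order and constituency, whereas the dictionary abstracts away from order and records only grammatical functions and features.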


3.2. OT marks 


OT marks allow the statement of preferences and dispreferences in sentence
analysis and can be ordered according to their relative importance. As a result,
these marks enable the system to deal with ambiguous and ungrammatical input.
The mark ungrammatical is added to error rules in the grammar, allowing the
parser to analyse ungrammatical sentences. Additionally, information on the
error type can be added and passed on to the f-structure. Consider
the following sentence:


• Peter mangi-o* una mela.
• Peter.3PSG eat-1PSG an apple.
• ‘Peter eats an apple’.


This is an example of erroneous subject-verb agreement. Parsing the sentence 
with the LFG grammar yields the following output (Figure 2). 


While the c-structure on the left illustrates the constituents and the hierarchical 
structure, the f-structure contains semantic information on the main predicate 
and its arguments. Additionally, it returns the information that subject-verb 
agreement is ungrammatical in this example. This information is passed on to 
the f-structure by adding certain annotations to the OT mark ungrammatical. 


Figure 2. C- and f-structure of Example 1
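
The interplay of error rules and the mark ungrammatical can be illustrated with a small Python sketch. It is a simplification under stated assumptions: real OT marks in XLE are ranked rather than given numeric costs, and the names MARK_COST, analyse, best, and the ERROR attribute are hypothetical and not taken from the project grammar.

```python
from typing import Dict, List, Tuple

# Assumed cost per mark; analyses with a lower total cost are preferred.
MARK_COST = {"ungrammatical": 1}

Analysis = Tuple[Dict, List[str]]  # (f-structure, marks collected)


def analyse(subject: Dict, verb: Dict) -> List[Analysis]:
    """Return candidate analyses for a subject-verb pair.

    The regular rule requires agreement in person and number. The error
    rule accepts a mismatch, adds the mark "ungrammatical", and records
    the error type in the f-structure.
    """
    fs = {"PRED": verb["PRED"], "SUBJ": dict(subject)}
    agrees = (subject["PERS"], subject["NUM"]) == (verb["PERS"], verb["NUM"])
    if agrees:
        return [(fs, [])]
    fs["ERROR"] = "subj-verb-agreement"
    return [(fs, ["ungrammatical"])]


def best(analyses: List[Analysis]) -> Analysis:
    """Prefer the analysis whose marks have the lowest total cost."""
    return min(analyses, key=lambda a: sum(MARK_COST[m] for m in a[1]))


peter = {"PRED": "Peter", "PERS": 3, "NUM": "sg"}
mangio = {"PRED": "mangiare", "PERS": 1, "NUM": "sg"}   # 1st person singular
mangia = {"PRED": "mangiare", "PERS": 3, "NUM": "sg"}   # 3rd person singular

fs_bad, marks_bad = best(analyse(peter, mangio))
fs_ok, marks_ok = best(analyse(peter, mangia))
print(marks_bad, fs_bad.get("ERROR"))
print(marks_ok, fs_ok.get("ERROR"))
```

Because the error rule only applies when the regular rule fails, grammatical input receives an analysis without marks, while erroneous input still receives an analysis that carries the error type.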


3.3. XLE generator


The generator component of XLE enables the system to create a sentence given an 
f-structure as input. Generation being the reverse of parsing, the same grammar 
produces a string based on a certain f-structure analysis. Here, the generator is 
used to provide a grammatical sentence given an ungrammatical input. Going 
back to Example 1, the XLE generator takes as input the f-structure in Figure 2 
and produces the grammatical alternative depicted in Figure 3. 


Figure 3. Grammatical alternative to Example 1 as generated by XLE


% parse "Peter mangio una mela." 

parsing {Peter mangio una mela.} 

1 solutions, 0.003 CPU seconds, 0.170MB max mem, 30 subtrees unified
1

% 

Peter mangia una mela. 


The first line shows the ungrammatical sentence that was passed on to the
system, while the last line shows the grammatical alternative with subject-verb
agreement satisfied.
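
The idea of generating a corrected sentence from an f-structure can be sketched in Python as follows. This is a toy stand-in and not the XLE generator: the function generate, the PARADIGMS table, and the flat f-structure layout are assumptions made for illustration, and only the present indicative of mangiare is covered.

```python
# Present indicative of "mangiare", keyed by (person, number).
PARADIGMS = {
    "mangiare": {
        (1, "sg"): "mangio", (2, "sg"): "mangi", (3, "sg"): "mangia",
        (1, "pl"): "mangiamo", (2, "pl"): "mangiate", (3, "pl"): "mangiano",
    }
}


def generate(fs: dict) -> str:
    """Produce a sentence whose verb agrees with the SUBJ of the f-structure."""
    subj, obj = fs["SUBJ"], fs["OBJ"]
    verb = PARADIGMS[fs["PRED"]][(subj["PERS"], subj["NUM"])]
    return f'{subj["PRED"]} {verb} {obj["SPEC"]} {obj["PRED"]}.'


# f-structure for the learner sentence. This sketch ignores the error
# annotation and selects the verb form from the subject's features.
fs = {
    "PRED": "mangiare",
    "SUBJ": {"PRED": "Peter", "PERS": 3, "NUM": "sg"},
    "OBJ": {"PRED": "mela", "SPEC": "una"},
    "ERROR": "subj-verb-agreement",
}
print(generate(fs))
```

Since the verb form is chosen from the subject's person and number rather than from the learner's input string, the output satisfies subject-verb agreement by construction.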


3.4. Lexicon 


The lexicon constitutes the third major building block of the tool. It was created by
converting the Morph-it! lexicon (Zanchetta & Baroni, 2005) into a finite-state
morphological analyser using the Xerox Finite State Tool (Beesley & Karttunen,
2003). The lexicon contains 34,968 lemmas and an overall count of 504,906
entries. It provides the grammar not only with a large Italian vocabulary but
also with morphological analyses that expand from the c-structure, as
depicted in Figure 4.
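
What a morphological lookup over such a lexicon returns can be approximated with a short Python sketch. Several assumptions apply: the three sample entries and their tab-separated form, lemma, features layout are illustrative and should be checked against the actual Morph-it! distribution, the function names are hypothetical, and a plain dictionary is used in place of a finite-state transducer.

```python
from collections import defaultdict

# Three sample entries in an assumed "form<TAB>lemma<TAB>features" layout.
SAMPLE = (
    "mangio\tmangiare\tVER:ind+pres+1+s\n"
    "mangia\tmangiare\tVER:ind+pres+3+s\n"
    "mela\tmela\tNOUN-F:s\n"
)


def load_lexicon(text):
    """Map each surface form to its list of (lemma, features) analyses."""
    lexicon = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        form, lemma, features = line.split("\t")
        lexicon[form].append((lemma, features))
    return lexicon


def analyse(lexicon, form):
    """Look up a form; unknown forms yield an empty list."""
    return lexicon.get(form, [])


lexicon = load_lexicon(SAMPLE)
lemmas = {lemma for analyses in lexicon.values() for lemma, _ in analyses}
entries = sum(len(analyses) for analyses in lexicon.values())
print(analyse(lexicon, "mangio"), len(lemmas), entries)
```

The same two counts, distinct lemmas and total entries, correspond to the figures reported for the full lexicon; a finite-state analyser stores the same information far more compactly.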


Figure 4. Integrating a finite-state morphological analyser


"Peter mangio una mela." 


4. Conclusions


This paper outlines an ongoing project on the development of a language learning
game for Italian. The tool is based on a deep grammar within the framework
of LFG and integrates an Italian lexicon. While the grammar component is
responsible for analysing the user input and returning adequate feedback, the
learning process is to be guided by insights from processability theory. At
this initial stage of development, the grammar component can detect error types 
and return the grammatical alternatives. Next steps include further expansion 
of the grammar and its syntactic structures, an evaluation of the grammar 
using learner corpora, and the development of learning material and learning 
exercises. The final stages of the project are then to incorporate the grammar 
and the learning material into an attractive and user-friendly environment, 
including gaming components to enhance the learning experience. 